DOCUMENT RESUME

ED 055 186                                            VT 013 812

AUTHOR          Siegel, Irving H.
TITLE           Aggregation and Averaging.
INSTITUTION     Upjohn (W. E.) Inst. for Employment Research,
                Kalamazoo, Mich.
PUB DATE        May 68
NOTE            40p.
AVAILABLE FROM  The W. E. Upjohn Institute for Employment Research,
                300 South Westnedge Avenue, Kalamazoo, Michigan 49007
                (Single copies without charge. Additional copies
                $1.50)
EDRS PRICE      MF-$0.65 HC-$3.29
DESCRIPTORS     *Economic Research; Employment Statistics; *Labor
                Economics; *Measurement; *Statistical Data

ABSTRACT

The arithmetic processes of aggregation and averaging
are basic to quantitative investigations of employment, unemployment,
and related concepts. In explaining these concepts, this report
stresses need for accuracy and consistency in measurements, and
describes tools for analyzing alternative measures. (BH)




Methods for Manpower Analysis 



No. 1 



AGGREGATION AND 
AVERAGING 






By 

IRVING H. SIEGEL 



U.S. DEPARTMENT OF HEALTH,
EDUCATION & WELFARE
OFFICE OF EDUCATION
THIS DOCUMENT HAS BEEN REPRODUCED EXACTLY AS RECEIVED FROM
THE PERSON OR ORGANIZATION ORIGINATING IT. POINTS OF VIEW OR
OPINIONS STATED DO NOT NECESSARILY REPRESENT OFFICIAL OFFICE
OF EDUCATION POSITION OR POLICY.



May 1968






The W. E. Upjohn Institute for Employment Research 
300 South Westnedge Avenue 
Kalamazoo, Michigan 49007



The Board of Trustees 
of the 

W. E. Upjohn Unemployment Trustee Corporation 

HARRY E. TURBEVILLE, Chairman
LAWRENCE N. UPJOHN, M.D., Honorary Chairman
(deceased)

E. GIFFORD UPJOHN, M.D., Vice Chairman
DONALD S. GILMORE, Vice Chairman
D. GORDON KNAPP, Secretary-Treasurer
MRS. GENEVIEVE U. GILMORE
ARNO R. SCHORER
CHARLES C. GIBBONS
PRESTON S. PARISH

C. M. BENSON, Assistant Secretary-Treasurer

The Staff of the Institute

HAROLD C. TAYLOR, PH.D.
Director
Kalamazoo

HERBERT E. STRINER, PH.D.
Director of Program Development
Washington

Kalamazoo Office

RONALD J. ALDERTON, B.S.
SAMUEL V. BENNETT, M.A.
EUGENE C. MCKEAN, B.A.
HAROLD T. SMITH, PH.D.
HENRY C. THOLE, M.A.
W. PAUL WRAY

Washington Office

E. WIGHT BAKKE, PH.D.
A. HARVEY BELITSKY, PH.D.
SAUL J. BLAUSTEIN, B.A.
ROSANNE E. BURCH, B.A.
SAMUEL M. BURT, M.A.
SIDNEY A. FINE, PH.D.
HENRY E. HOLMQUIST, M.A.
J. E. MORTON, PH.D.
MERRILL G. MURRAY, PH.D.
HAROLD L. SHEPPARD, PH.D.
IRVING H. SIEGEL, PH.D.



Foreword 



This paper, Aggregation and Averaging, by Dr. Irving H. Siegel, of the
Washington Office of the Upjohn Institute, inaugurates a new series on
Methods for Manpower Analysis. The series represents an expansion
of the scope of the Institute's research and publication program.

The papers in the series are intended to reflect the state of art and to
have tutorial value. They will deal with methods applicable to manpower
analysis as well as to methods actually used. They will often take
advantage of original research, as Dr. Siegel's paper does.

Harold C. Taylor 
Director 



Kalamazoo, Michigan
May 1968 











Preface 

The subject of this paper, the first in the new series on Methods for
Manpower Analysis, has been of long interest to the author. It is
fundamental to all quantitative investigations of employment,
unemployment, and related concepts.

An effort has been made to appeal to the interests and needs of
readers at different levels of sophistication. A quotation from Alice's
Adventures in Wonderland that could have served as an epigraph to this
paper guided the selection and presentation of material and references:

    And what an ignorant little girl she'll think me for asking! No, it'll
    never do to ask; perhaps I shall see it written up somewhere.

Comments from readers are invited so that the value of any subsequent
version of this paper to makers and users of manpower measures
may be enhanced.

Irving H. Siegel

Washington, D.C.

May 1968






Contents

Foreword

Preface

I. Scope of Paper

II. A Prologue on Measurement

III. Aggregation

IV. Averaging

V. Consistent Aggregates and Averages

VI. Tools for Analyzing Alternative Measures

VII. Toward Better Design and Use of Manpower Measures

Appendix: Summation Symbols and Rules







The W. E. Upjohn Institute 
for Employment Research 



The Institute, a privately sponsored nonprofit research
organization, was established on July 1, 1945. It is an
activity of the W. E. Upjohn Unemployment Trustee
Corporation, which was formed in 1932 to administer a
fund set aside by the late Dr. W. E. Upjohn for the purpose
of carrying on “research into the causes and effects
of unemployment and measures for the alleviation of
unemployment.”

One copy of this bulletin may be obtained without
charge from the Institute in Kalamazoo or from the
Washington office, 1101 [street number illegible] Street, N.W.,
Washington, D.C., 20036. Additional copies may be
obtained from the Kalamazoo office at a cost of 50 cents
per copy.

Aggregation and Averaging 

I. Scope of Paper

Several reasons may be cited for linking the arithmetic processes of
aggregation and averaging. First of all, these processes are basic to
manpower measurement. They underlie other, more complex, sequences
of numerical operations and, accordingly, are encountered daily and
everywhere. They are close mathematical relatives, describe somewhat
similar algebraic structures, and yield numbers that are readily
convertible into each other. They provide complementary, though partial,
quantitative descriptions of an ensemble or population of discrete elements.
The descriptions are partial in that they focus on but one common
property or dimension of the elements at a time. They are complementary
in that aggregation views the ensemble as a composite, a whole, while
averaging characterizes the ensemble in terms of a representative
element.

The goal of aggregation or of averaging is the provision of a single
summary magnitude: an aggregate¹ in the first case, an average (also
called a mean or mean value) in the second. Both processes combine
measures for the elements. These measures refer to a common attribute,
the one originally selected for an assignment of element numbers or a
derived unit introduced by weighting. Measures derived for the elements
by weighting are expressed in a unit or dimension that is presumably
more appropriate for the purpose of comparison and combination. The
final result is a weighted aggregate or a weighted average. Since
aggregates and averages are single figures, they no longer tell anything
about the dispersion of the numbers corresponding to the discrete
elements.

When two or more properties of each element are of simultaneous
interest, their measures have to be reduced to a common denominator

¹ The usage of aggregate in this paper differs from that sometimes encountered not only
in mathematical literature but also in statistical writings (e.g., in books by A. C. Aitken and
W. G. Cochran). In the early vocabulary of “set theory,” which the standard treatises of
Pierpont and Hobson on real-variable functions helped to propagate, the word was
commonly used instead of ensemble. Enough other synonyms are available, however, for the
latter, e.g., class, group, population, set, and universe. The prefix sub is also applicable to
these words for a distinction between the ensemble and a part (larger than a discrete
element).

While we are occupied with matters of terminology, we should also note that the
ultimate (discrete) elements of an ensemble may be called individuals, items, or members. The
common property in terms of which the elements are quantitatively evaluated in the first
instance may also be designated an attribute, characteristic, concept, quality, variable, or
variate; the resulting numbers are figures, magnitudes, and measures. In their subsequent
transformations, the numbers remain expressed on a common scale, that is, in a common
denominator, dimension, or unit.







before aggregation or averaging can proceed. The required preliminary
step may entail weighting or a more formal scalarization of vectors.
Another possibility is the derivation of conversion coefficients for the
several variables from a fitted multivariate regression (or response)
function. This method lies outside the domain of the present paper.

This exposition is also circumscribed by two voluntary assumptions:
that the elements are numerically describable without errors of
observation and that sampling is not required. Other papers in this series will
deal explicitly with issues of statistical inference, that is, with the
treatment or interpretation of data as samples drawn from real or
hypothetical populations. Thus, in modern jargon, the viewpoint adopted here is
deterministic rather than stochastic or probabilistic. Since connections with
other provinces of analytical interest are too important to overlook,
however, the reader of this paper will be reminded later that the statistical
estimation of aggregates is a familiar and important topic of sampling
theory and practice, and that the various kinds of averages may be viewed
as “most probable” values for different “laws of error” or for different
least-squares models.

Concentration on “100 percent samples” or on measurement without
observational error hardly leaves a methodology paper devoid of
significant issues. Matters of concept, definition, and dimensional propriety,
for example, are always vital. They cannot be salutarily ignored since
bad decisions or loose administration in data gathering may introduce
biases for which subsequent compensation is not easy if at all possible.
In this connection, it is pertinent to cite the position taken in the 1930's
by a well-known statistician on a proposed enumeration of the
unemployed; namely, that:

    ... even a 100 percent sample could not give 5 percent accuracy
    because of differing ideas regarding definitions of unemployment and
    the interpretation of the questions. . . . Before it is profitable to talk
    of reducing sampling error to 5 percent, it would be necessary to
    reduce both the variability in response (by sharpening the definition)
    and the error of enumeration to magnitudes comparable with 5
    percent accuracy.²



Long strides have since been taken in the design and use of official
manpower statistics, but impressive nonsampling difficulties persist.
Conspicuous gaps in industry coverage remain evident. Concepts and
measures that are not strictly compatible have to be used frequently for
want of better. Resort must often be had to indirect techniques for
measuring concepts that are unclear in the first instance and that can be
approximated only crudely at best. Many of these difficulties are
reflected in hearings and prints of the Joint Economic Committee, in
Economic Reports of the President, and in numerous other authoritative
publications, such as the report issued in 1962 by a Presidential
committee.³

² Attributed to Frederick F. Stephan by W. E. Deming, Some Theory of Sampling (New
York: Dover reprint, 1966), p. 39.

Confusion may arise occasionally from the availability of more than
one series for the “same” manpower concept, but this semblance of
duplication is rare and is not necessarily deplorable. The Presidential
committee dwelt at some length, for example, on problems engendered
by the existence of two nonagricultural employment measures, one
derived from a survey of firms and the other from a survey of households.
A multiplicity of manpower measures, however, is atypical; and, besides,
it can serve well the needs of discriminating analysis. If scarce statistical
resources cannot realistically be redistributed in a clearly preferable
manner, the remedy for apparent duplication is not the reduction of
information but the better instruction of the public concerning the
nuances of meaning.

A basic idea of this paper is that alternative choices of data, units,
formula, and weights lead to measures for concepts that are cognate
rather than the “same.” These measures may seem equally eligible if
no context or purpose is specified; but they actually are distinct members
of a family, have variant meanings, and can differ significantly in
magnitude.

Ideally, the use to be made of a manpower measure and a knowledge
of the other variables to be measured jointly should govern our choices,
but the “prefabricated” data or series that already exist are often the
only ones that are practicably available. Limitations in the supply and
quality of data and of series handicap analysis and should not be ignored
in interpretation. In particular, the myth that existing statistics are
“general-purpose” measures should not be taken too literally by the user.

Although this paper makes reference to various manpower concepts for
which time series are available, the discussion does not focus on temporal
change. Index numbers and manpower projections are principal
subjects, instead, for other pamphlets of the Upjohn Institute. The treatment
of aggregates and averages here is intended to lay a basis for the
expositions of these and other more complex methods for manpower analysis.

II. A Prologue on Measurement

An observation made by John Locke in An Essay Concerning Human
Understanding (1690) provides a fitting introduction to a paper on basic
numerical processes. Concerning “unity or one,” Locke remarked that

³ See report of President's Committee to Appraise Employment and Unemployment
Statistics, Measuring Employment and Unemployment (Washington: 1962), especially
Chapter 4 and Appendix I; and Oskar Morgenstern, On the Accuracy of Economic
Statistics (2nd ed.; Princeton: Princeton University Press, 1963), Chapter 13.






    ... it is the most intimate to our thoughts, as well as it is, in its
    agreement to all other things, the most universal idea we have. For
    number applies itself to men, angels, actions, thoughts, everything
    that either doth exist or can be imagined.⁴



Of primary interest here is the applicability of number to “men,”
specifically, to the summary description of such manpower concepts as
labor force, employment, unemployment, payrolls, hourly earnings,
weekly hours, man-hour productivity, and unit labor cost.

Varying degrees of commitment may be discerned in the application
of number to manpower (and other) concepts.⁵ The weakest degree is
the use of number for mere identification and classification, that is, for the
differentiation of individuals from each other and for grouping them into
more or less homogeneous categories. Thus, distinct serial numbers may
be assigned to the employees on a company payroll, to the members of
the military services, and to the registrants under the Social Security
system. Coding digits, furthermore, may distinguish workers in one
department from those in another, officers from enlisted personnel, men
from women. A stronger use of number is to rank the members of an
ensemble according to magnitude with respect to a common observed
or derived property. Thus, with respect to a selected property, we may
say that A is greater than (>), less than (<), or equal to (=) B; and
that, if A > B and B > C, then A > C. These relationships involve
symmetry and transitivity according to the terminology of logic.⁶ A
third variety of numerical application comes closer to true measurement:
It permits the comparison of magnitudes, not as absolute totals, but
with regard to differences. A unit that is fixed in meaning or nearly so
has to be available for the determination of an excess or deficit.

True measurement is stricter than any of the foregoing applications
of number, for it assumes a scale having both an origin or zero and a rigid
unit. It assumes that the fundamental operations performable on pure
numbers have clearly interpretable or manifestly plausible counterparts
in the treatment of our manpower concepts. The operations are addition,
subtraction, multiplication, and division (except by zero). In the language
of mathematics (especially of set and group theory) and of logic, an
isomorphism, or structural equivalence, is assumed between the domain
of possible magnitudes of a manpower concept and the co-domain of





⁴ Book II, Chapter 16, Article 1, of Locke's classic.

⁵ This paragraph and the next two take some account of ideas presented by S. S. Stevens,
“On the Theory of Scales of Measurement,” in Arthur Danto and Sidney Morgenbesser,
eds., Philosophy of Science (New York: Meridian Books, 1960), pp. 141-199; N. R.
Campbell, Foundations of Science: The Philosophy of Theory and Experiment (New York: Dover
reprint, 1957); P. W. Bridgman, The Way Things Are (Cambridge: Harvard University
Press, 1959), pp. 135-137; and M. R. Cohen and Ernest Nagel, An Introduction to Logic
and Scientific Method (New York: Harcourt, Brace, 1934), Chapter 15.

⁶ Cohen and Nagel (footnote 5), pp. 297-298.




















“real” (rational and irrational) numbers. A “one-to-one mapping”
or “injection” associates every manpower magnitude with the same
number in the co-domain.⁷

For manpower measurement, it is preferable to adopt this strict
scale in the first instance and subsequently to temper or to downgrade
the implications of numerical operations (if necessary) by appeal to
additional information and to common sense. The ratio scale, as it is
sometimes called,⁸ permits one to say not only that A is larger or smaller
than B by a given amount but also how many units each one contains
and what the relative magnitudes are. When crude data or techniques
of estimation have been used, however, or when computations are
subject to severe rounding, it is desirable not to “squeeze the numbers
too hard.” Judgment is always in order even when it may be out of
fashion. The comment cited earlier on the manpower census of the
1930's is also pertinent here.

More than one ratio scale may be of interest in aggregation and
averaging. It was noted at the outset that numbers expressed in a
conventional or natural unit are assigned to the members of a manpower
ensemble, and that the original numbers may be converted, by the introduction
of weights, to a common denominator deemed more relevant or more
stable for the problem under consideration. Employment figures, for
example, may be stated originally as numbers of people, but the problem
may require translation of such figures into man-hour units. Within the
definition of workers or man-hours that is adopted, the aggregation or
averaging process strictly implies that any worker or man-hour is
equivalent to, and quantitatively exchangeable with, any other. A proper
discount may have to be made, however, in interpretation.

Corresponding to the measurement of totals (from zero) and of
differences between totals are the notions of stock and flow. These
two terms, occasionally encountered in writings on economic time series,⁹
are adaptable to the discussion of manpower aggregates and averages.
A stock refers to a status or inventory, that is, to a total quantity that is fixed
at a point in time or selected as typical of a period. An example is the
number of workers reported by an establishment on Form 790 of the
U.S. Bureau of Labor Statistics for the pay period including the 12th


⁷ Ibid., pp. 137-141, on isomorphism. The term is also mentioned by Stevens and
Bridgman (see footnote 5). Important in advanced mathematics, the concept is treated in
standard works on higher algebra (e.g., by Birkhoff and Mac Lane), on sets and groups, and on
matrices. Among the writings consulted in the preparation of this paper were J. A. Green,
Sets and Groups (London: Routledge & Kegan Paul, 1965) and F. E. Hohn, Elementary
Matrix Algebra (New York: Macmillan, 1958), especially the appendix on “The General
Concept of Isomorphism,” pp. 288-290.

⁸ Stevens (footnote 5), pp. 147-148.






⁹ Among the few modern works using the terms stock and flow is R. G. D. Allen,
Macro-Economic Theory: A Mathematical Treatment (London: Macmillan, 1967), pp. 2-3.











day of a given calendar month. We may average such stock figures for
12 consecutive months to obtain one stock estimate for characterization
of a whole year. A flow represents a (gross or net) change that is
recorded during an interval in a real or imaginable stock. An example is
the number of man-hours worked during a month, a gross addition
to a conceivable initial (zero or positive) stock. Such flow figures may,
unlike stock figures, reasonably be combined into an aggregate for an
interval of time; thus, an annual total of man-hours worked is a
meaningful measure. A change in the number of workers on consecutive
payrolls is also a flow, namely a net difference between two stock figures. Not
only are flow figures cumulative but they may also plausibly be averaged;
for example, a “typical” monthly man-hour total may be derived for
a particular year from 12 monthly figures.

Stock and flow figures of the same genus are connected by a formula.
For example:

    Number of workers reported for a given month
        = Number of workers reported for preceding month
          + Gains - Losses.

The two reported worker totals are status figures or stocks. The gains
represent the gross inflow of workers from one pay period to the next;
the losses represent the gross worker outflow. The difference between
gains and losses is the net flow (plus or minus).
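
The stock-flow relation just stated can be illustrated with a minimal Python sketch. The function name `next_stock` and all figures are hypothetical, chosen only for illustration; they are not data from this paper.

```python
def next_stock(previous_stock, gains, losses):
    """Return the worker total (a stock) for the month that follows."""
    return previous_stock + gains - losses


# Hypothetical figures for one establishment (not taken from the paper).
january_workers = 120   # stock: workers reported for January
gains = 15              # gross inflow of workers between pay periods
losses = 9              # gross outflow of workers between pay periods

february_workers = next_stock(january_workers, gains, losses)
net_flow = gains - losses   # net flow; it may be plus or minus

# Stock figures are averaged to characterize a period, whereas flow
# figures may also be summed over the period.
monthly_workers = [120, 126, 124, 130]               # stocks
monthly_man_hours = [19200, 20160, 19840, 20800]     # flows

average_workers = sum(monthly_workers) / len(monthly_workers)
total_man_hours = sum(monthly_man_hours)
```

The sketch averages the stock series but sums the flow series, in keeping with the distinction drawn in the text.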

It is a familiar plaint that measurement in the human disciplines lacks
the definitiveness apparently achievable in the physical sciences and in
the world of objects in general. Such employment units as the worker or
man-hour, indeed, lack the stability or homogeneity of the meter (a
unit of length), the second (time), and the degree Kelvin (absolute
temperature). Another way of stating the situation is that manpower
measurement, however precisely accomplished and however refined
the unit we choose, still fails (a) to reflect cogently and comprehensively
the essence of a multidimensional social phenomenon or (b) to reflect
what measurement in terms of some other important property might be
expected to show.

In manpower and other aggregation and averaging, it is desirable to
distinguish between “literal” and “verbal” algebra.¹⁰ The details of
composition and structure of a summary measure are not strictly
divorceable from the circumstances in which such a measure is to be used.
Different formulas and weighting schemes yield results that may be
significantly dissimilar. Two sets of numerical assignments in the
“same” unit differently defined are not necessarily proportional to each
other. Multiple plausible measures may be designed for a family of related
concepts, but they are not casually interchangeable. These recognized and
unrecognized dangers give importance to literal algebra, which is
concerned with the design of a measure to accord with the purpose or
context of use. It is also concerned with correct interpretation. It pays due
regard, therefore, to the ingredients and construction of
whatever measure happens to be used for want of better. Verbal
algebra, on the other hand, is content with names and labels. “Any old”
aggregate or average that permits a proper “cancellation of words” is
uncritically accepted. Among the possible unsatisfactory consequences of
verbal algebra are dimensional eccentricity, to which some attention
will later be given, and the mistaking of noise for message.

¹⁰ The distinction between literal and verbal algebra has been made by I. H. Siegel in
various places, e.g., in “Systems of Algebraically Consistent Index Numbers,” 1965
Business and Economic Statistics Section Proceedings of the American Statistical Association,
pp. 369-372; “On the Design of Consistent Output and Input Indexes for Productivity
Measurement,” in Output, Input, and Productivity Measurement (Princeton: Princeton
University Press, 1961), pp. 23-41; and Concepts and Measurement of Production and
Productivity (Washington: U.S. Bureau of Labor Statistics, 1952).

Since the assignment of numbers to concepts is usually far from
ideal, attention will also be devoted in this paper to algebraic tools for
analysis of the relationship between alternative summary measures. As
already noted, concepts, units, formulas, and weights should preferably
be related to the purpose or context of measurement, which should also
dominate the choice of adjustment procedures for overcoming limitations
of available data or series. Furthermore, common sense may dictate a
preference for one aggregate or average over another on mere
dimensional grounds. Since differences in content, form, and method may
significantly affect the numerical results, the maker or user of summary
measures should consider the sources, magnitude, and direction of possible
divergence.

III. Aggregation

Definition 



In brief, aggregation may be described as the derivation and
subsequent combination into a sum of commensurable numbers
corresponding to the elements of an ensemble. The summed numbers are defined
on a ratio scale. They are commensurable in that they have a common
denominator; they do not literally have to be integers, exact multiples
of a prescribed unit.

More explicitly, aggregation entails (a) the assignment of numbers
to a common property of the elements of an ensemble; (b) the adjustment
of these numbers, if necessary, to overcome limitations of the
underlying data or to effect a refinement of concept; (c) the weighting of the
original or adjusted numbers, if necessary, to reduce them to a common
denominator that is deemed more homogeneous, more stable, or more









relevant to the aim of an exercise; and (d) the summation of the original,
adjusted, or weighted figures. The sum is the aggregate, the final result
of the process. It measures the size of an ensemble with reference to a
pertinent observable or derived property.
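
The four steps (a) through (d) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function `aggregate`, its parameters, and the sample figures are hypothetical and are not part of the paper.

```python
def aggregate(values, adjust=None, weights=None):
    """Sum element measures after optional adjustment and weighting.

    values  -- step (a): numbers assigned to a common property
    adjust  -- step (b): optional function applied to each number
    weights -- step (c): optional conversion factors, one per element
    """
    if adjust is not None:
        values = [adjust(v) for v in values]
    if weights is None:
        # No weights introduced: the numbers are, in effect, equally weighted.
        weights = [1] * len(values)
    if len(weights) != len(values):
        raise ValueError("one weight is needed per element")
    # Step (d): summation of the original, adjusted, or weighted figures.
    return sum(w * v for w, v in zip(weights, values))


# Hypothetical data: workers in three departments and weekly hours per worker.
workers = [40, 25, 35]
hours_per_worker = [38, 42, 40]

head_count = aggregate(workers)                            # size in workers
man_hours = aggregate(workers, weights=hours_per_worker)   # size in man-hours
```

The same ensemble thus has two sizes, one in workers and one in man-hours, depending on the common denominator chosen.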

Aggregates differ according to the original choice of a common
attribute of ensemble elements and the subsequent modes of refinement
and weighting. When no weights are introduced, the numbers originally
assigned to the elements are, in effect, equally weighted. Sometimes, it
is analytically useful to rewrite a weighted or unweighted aggregate in
an equivalent expanded form, that is, with “telescoping” weights.



Formulas 

Counting is the simplest and most familiar variety of aggregation.¹¹
Every element of any group has at least the attribute of discreteness,
of “oneness.” Thus, if a group has n elements, the most obvious
aggregate is $1 + 1 + \cdots + 1 = \sum_{i=1}^{n} 1 = n$. (On summation
symbols, see Appendix.)

Although n (or any other aggregate measure) usually stands by
itself, without a designated unit, it is not really a pure or abstract
number. Even in the simplest case, it has an implied unit, e.g., “elements”
or “things.” It may also represent a sum of people, employees,
man-hours, unemployed persons, or payroll dollars, i.e., a sum referring to a
manpower characteristic.

When the elements have been grouped into subclasses, a weighted
sum of subclass measures may be substituted for a completely fresh
count. Each subclass is treatable as a complex element; it has a “oneness,”
but its content of ultimate elements provides the weighting factor
needed for a much better determination of the size of the ensemble. The
symbol for the sum of elements in the s subclasses of unequal size is
$\sum_{i=1}^{s} n_i = n$. Again, a common dimension is implied: elements, things,
or some more explicit characteristic, such as number of employees.

When employment is expressed initially in terms of workers and the
preferred common unit is man-hours, a transformation of the original
numbers is achievable with weights representing hours per worker. Thus,
if $n_i$ employees in the i-th industry work, say, $h_i$ hours per week and if
there are s industries altogether, the corresponding weighted aggregate
is $\sum_{i=1}^{s} n_i h_i$. The subscripts may be dropped if no confusion would
result: $\sum hn$.
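
The three aggregates discussed so far (a simple count, a sum of subclass sizes, and a man-hour total) can be compared in a short Python sketch. The industry labels and hours figures are assumptions made for illustration only.

```python
from collections import Counter

# Hypothetical ensemble: each element is one worker, tagged by industry.
worker_industry = ["mining", "retail", "retail", "mining", "retail", "transport"]

# Counting: every element contributes its "oneness," a 1, to the sum.
n = sum(1 for _ in worker_industry)

# Subclass sizes n_i act as weights for the s subclasses.
subclass_sizes = Counter(worker_industry)
n_from_subclasses = sum(subclass_sizes.values())

# Man-hours: each n_i is weighted by h_i, weekly hours per worker
# (assumed values, uniform within an industry).
weekly_hours = {"mining": 40, "retail": 35, "transport": 44}
man_hours = sum(subclass_sizes[i] * weekly_hours[i] for i in subclass_sizes)
```

The count and the sum of subclass sizes agree by construction; the man-hour total is a different aggregate for the same ensemble, expressed in a different unit.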

More complex situations are often encountered in manpower
aggregation, and they are easy to handle. Thus, a need may arise to
distinguish the different companies within an industry and the different
occupations or departments within a company, especially because
of dissimilarities with respect to number of workers, hours of work, or
some other relevant characteristic, such as hourly remuneration. More
summation signs (or more subscripts, at least) then have to be
introduced.

¹¹ Stevens (footnote 5), p. 147, observes:

    Foremost among the ratio scales is the scale of number itself, cardinal number, the
    scale we use when we count .... This scale of the numerosity of aggregates is so basic
    and so common that it is ordinarily not even mentioned in discussions of measurement.

Let us consider a specific case involving a fixed number (s) of
industries, a variable number (f) of companies in each industry, a variable
number (g) of departments in each company, and variable numbers
of workers (n) and hours per worker (h) in each department of a
company. The expression for total man-hours may then be written very
explicitly as:

$$\sum_{i=1}^{s} \sum_{j=1}^{f_i} \sum_{k=1}^{g_{ij}} n_{ijk} h_{ijk}$$



The symbols direct that we first sum man-hours by department
(subscript k) for each company (subscript j), then sum the company results
within each industry (subscript i), and finally sum the industry figures
into a grand total. The procedure may be visualized simply in a “tree”
diagram (see Appendix).

When there is no ambiguity, it is sufficient to write $\sum nh$. This
stripped version of the expression presented above focuses attention on
workers and hours in department “cells.” The cells may be identified
exhaustively, unequivocally, and without duplication by means of
permutations of the industry-company-department subscript numbers.
Accordingly, aggregation may be accomplished directly and completely
at the department level if we do not need also to have company and
industry subtotals.
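
The equivalence of the stepwise (nested) summation and the direct summation over department cells can be checked with a small Python sketch. The nested-list layout and all figures are hypothetical.

```python
# cells[i][j][k] = (workers, hours per worker) for department k of
# company j in industry i. All figures are hypothetical.
cells = [
    [                                   # industry 1: two companies
        [(10, 40), (5, 36)],            # company 1: two departments
        [(8, 38)],                      # company 2: one department
    ],
    [                                   # industry 2: one company
        [(20, 40), (4, 30), (6, 35)],   # company 1: three departments
    ],
]


def company_total(departments):
    """Sum man-hours over the departments of one company."""
    return sum(n * h for n, h in departments)


def industry_total(companies):
    """Sum company man-hour totals within one industry."""
    return sum(company_total(c) for c in companies)


# Stepwise: departments, then companies, then industries.
nested_total = sum(industry_total(industry) for industry in cells)

# Direct: one pass over all department cells, with no subtotals.
flat_total = sum(n * h
                 for industry in cells
                 for company in industry
                 for n, h in company)
```

The nested route yields company and industry subtotals along the way; the flat route gives only the grand total.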

This is a good place to underscore two points made earlier about
alternative aggregate measures for the same ensemble. First, the size
of a company as represented by man-hours worked obviously exceeds the
size in terms of the number of workers. Second, a percentage
distribution of man-hours by company department differs from a distribution
based on numbers of workers if hours of work are not uniform.
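
The second point can be made concrete with a brief Python sketch. The department names, worker counts, and hours are assumed values used only for illustration.

```python
# Hypothetical company: department -> (workers, hours per worker).
departments = {
    "assembly": (50, 40),
    "office": (30, 35),
    "shipping": (20, 45),
}


def percent_shares(values):
    """Convert a mapping of magnitudes into percentage shares of the total."""
    total = sum(values.values())
    return {key: 100 * value / total for key, value in values.items()}


workers = {d: n for d, (n, h) in departments.items()}
man_hours = {d: n * h for d, (n, h) in departments.items()}

worker_shares = percent_shares(workers)      # distribution by workers
man_hour_shares = percent_shares(man_hours)  # distribution by man-hours
```

Because hours per worker differ by department here, the two percentage distributions diverge; with uniform hours they would coincide.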

The next observation provides a bridge to the discussion of averages; 
but- a determined crossing will not be made just yet. If a company’s 
* ' hours of work are uniform from department to department, "the measure 
231- x nkhk simplifies to where 7t is the constant company wide 

figure for hours and of course, is the company’s number of workers. 

If subscripts are dropped and the two expressions are written as an
identity, the result is

$$\frac{\sum nh}{\sum n} = \frac{hN}{N},$$

which has the form of a weighted average. The expression on the right,
of course, is mathematically otiose.

An examination of alternative forms of a given aggregate (fixed in
dimension and magnitude) also yields valuable insights. It assists in the
proper design of averages and of algebraically consistent aggregates.
It protects against dimensional impropriety in the construction of summary
measures. Consider the fact that at least six different multiplicative
verbal identities may be written for a payroll aggregate:






Payroll = Workers X Hours per Worker X Hourly Earnings
        = Man-hours X Hourly Earnings
        = Output X Unit Labor Cost
        = Workers X Hours per Worker X Hourly Productivity X Unit Labor Cost
        = Man-hours X Hourly Productivity X Unit Labor Cost
        = Output X Unit Labor Requirements X Hourly Earnings

Accordingly, if sufficiently detailed information is available for the
commodity output, labor input, and worker remuneration of a company,
industry, or larger sector of the economy, it becomes possible to express
the corresponding payroll aggregate as a sum in at least six different ways
(subscripts omitted):

    $\sum nhe$
    $\sum me$
    $\sum qc$
    $\sum nhpc$
    $\sum mpc$
    $\sum qre$

The meanings of the italic letters become clear upon comparison of 
verbal and algebraic equivalents. 
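The equivalence of the six sums can be checked numerically. The sketch below uses invented figures for three establishments and assumes the natural readings of the letters (n workers, h hours per worker, e hourly earnings, m man-hours, q output, p hourly productivity, c unit labor cost, r unit labor requirements); the helper `total` is a hypothetical name.

```python
# Hypothetical figures for three establishments; all numbers are invented.
n = [10, 20, 5]              # workers
h = [40.0, 35.0, 44.0]       # hours per worker
e = [2.5, 3.0, 2.0]          # hourly earnings (dollars per man-hour)
q = [800.0, 2100.0, 330.0]   # output (physical units)

m = [ni * hi for ni, hi in zip(n, h)]              # man-hours
p = [qi / mi for qi, mi in zip(q, m)]              # hourly productivity
r = [mi / qi for mi, qi in zip(m, q)]              # unit labor requirements
c = [mi * ei / qi for mi, ei, qi in zip(m, e, q)]  # unit labor cost

def total(*factors):
    """Sum of element-by-element products of the given lists."""
    result = 0.0
    for values in zip(*factors):
        product = 1.0
        for v in values:
            product *= v
        result += product
    return result

payroll_sums = {
    "nhe": total(n, h, e),
    "me": total(m, e),
    "qc": total(q, c),
    "nhpc": total(n, h, p, c),
    "mpc": total(m, p, c),
    "qre": total(q, r, e),
}
```

Each entry of `payroll_sums` should agree with the others up to floating-point rounding, since every product collapses to workers times hours times hourly earnings for the same establishment.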

Can the payroll aggregate also be expressed in different ways as a 
product of aggregates and averages for the very same characteristics that 
enter into the verbal identities? The answer is yes, provided that averages 
and aggregates are weighted with care; they cannot be of “any old” 
variety. The problem demands meticulous literal algebra; casual verbal 
algebra will not do. Dimensional sense imposes an additional constraint. 
This paper later shows how a payroll identity may be written in terms of 
appropriate aggregates and averages.

Although sums of unweighted and weighted logarithms have not been
discussed here, they are encountered, as well as aggregates involving
ordinary numbers, in manpower analysis. When logarithms are summed,
the aggregate represents the logarithm of a product; and a weighted sum
of logarithms corresponds to the logarithm of the product of numbers
that have been raised to powers (the powers are the weights). Logarithmic
aggregates, as shown below, are pertinent to the discussion of
geometric averages.

Estimation From Samples 

Since population aggregates often have to be estimated from sample 
information, some illustrations are offered. The topic also belongs to the 
province of survey techniques and accordingly is eligible for treatment 
in other contributions to this series on Methods for Manpower Analysis. 

The usual objective in estimating an aggregate is to obtain a figure
that is unbiased and has a tolerably small, if not the least possible,
sampling variance. The procedure is unbiased if its “expected” result, in a
statistical sense, is the same as the population total. The sampling plan
and the estimation procedure have to be coordinated closely if bias is to
be avoided, restricted, or compensated, and if the variance is to be kept
within acceptable bounds.12

Suppose that $F$ companies comprise an industry and that $f$ of them are
to be sampled with a view to estimation of total employee man-hours. If
all companies have equal probabilities of inclusion in the sample, and if
man-hour data are obtained and used for all the workers in a selected
company ($m_i$), then $(F/f)\sum m_i$ provides an unbiased estimate of total
industry man-hours. The “blowup factor,” $F/f$, is also called a “sampling
ratio,” a “weighting factor,” or an “expansion ratio.”
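Unbiasedness of the blown-up sample total can be illustrated by brute force. The sketch below assumes simple random sampling without replacement (one way of giving every company the same probability of inclusion) and uses invented man-hour figures; it averages the estimate over every possible sample and compares the result with the true industry total.

```python
from itertools import combinations

# Hypothetical man-hours for an industry of F = 5 companies (invented figures).
man_hours = [400.0, 700.0, 220.0, 950.0, 130.0]
F = len(man_hours)
f = 2  # companies drawn; every sample of size f is assumed equally likely

def blowup_estimate(sample):
    """(F/f) times the sum of man-hours of the sampled companies."""
    return (F / f) * sum(sample)

# Enumerate all possible samples and average the resulting estimates.
estimates = [blowup_estimate(s) for s in combinations(man_hours, f)]
expected_value = sum(estimates) / len(estimates)
true_total = sum(man_hours)
```

Individual estimates differ widely from sample to sample; only their average over all samples coincides with `true_total`, which is what the term “unbiased” asserts.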

Other sampling schemes may also yield unbiased estimates. Suppose
that in each of $f$ companies selected with equal probability an employee is
picked at random. If his hours are $h_i$, and if his company has $n_i$
employees altogether, an unbiased estimate of total industry man-hours is
given by $(F/f)\sum n_i h_i$.

If the $F$ companies in the industry have unequal probabilities of
selection ($p_i$), and if $f$ companies are drawn at random, we may design an
unbiased estimate of aggregate man-hours for this case also. The estimate is
$(1/f)\sum m_i/p_i$, where $m_i$ represents the man-hours of the $i$th selected
company. It can be shown that, if only one company were to be sampled
(i.e., $f = 1$), the variance of the estimated aggregate is reduced to zero
when the $p_i$ are proportional to the $m_i$. Attempts accordingly are made in
practice to approximate such $p_i$ values, e.g., on the basis of man-hour
figures derived from earlier surveys.
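The single-draw case lends itself to an exact check. The sketch below assumes the estimate takes the form $m_i/p_i$ when one company is drawn with probability $p_i$, uses invented man-hour figures, and computes the mean and variance of the estimate directly from the selection probabilities; `estimator_moments` is a hypothetical helper.

```python
# Hypothetical man-hours for five companies (invented figures).
man_hours = [400.0, 700.0, 220.0, 950.0, 130.0]
total = sum(man_hours)

def estimator_moments(m, p):
    """Mean and variance of m_i/p_i when one company is drawn with probability p_i."""
    values = [mi / pi for mi, pi in zip(m, p)]
    mean = sum(pi * v for pi, v in zip(p, values))
    variance = sum(pi * (v - mean) ** 2 for pi, v in zip(p, values))
    return mean, variance

equal_p = [1 / len(man_hours)] * len(man_hours)
proportional_p = [mi / total for mi in man_hours]  # p_i proportional to m_i

mean_equal, var_equal = estimator_moments(man_hours, equal_p)
mean_prop, var_prop = estimator_moments(man_hours, proportional_p)
```

Both sets of probabilities give an expected value equal to the true total, but only the proportional set drives the variance to zero (apart from rounding), which is why prior man-hour figures are attractive as a basis for the $p_i$.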

12 Concerning this paragraph and the next three, see, for example, Deming
(footnote 2), pp. 87-99.

We conclude this section with an illustration from “multistage sampling,”
a technique largely developed for population surveys but easily
adapted to manpower studies in general. A monograph issued in 1947 by
the U.S. Bureau of the Census shows the following formula (original
symbols) for an unbiased estimate, $x^*$, of “the population contained within
the [city] area covered by block-sampling”:



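$$x^* = \sum_{i=1}^{R} \frac{M_i}{m_i} \sum_{j=1}^{m_i} \frac{N_{ij}}{n_{ij}} \sum_{k=1}^{n_{ij}} x_{ijk}$$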




Here, the subscript $i$ refers to one of the $R$ strata of the city, $j$ to one of
the $M$ city blocks, and $k$ to one of the $N$ dwelling places in the city; the
$m_i$ and $n_{ij}$ refer to the sampled blocks and sampled dwelling places,
respectively; and the sampling ratios, $M_i/m_i$ and $N_{ij}/n_{ij}$, are assumed
to be known. The formula may obviously be adapted to the estimation, for
example, of employment in an ensemble of $R$ industries by means of a sample of
workers in $n_{ij}$ companies chosen within each of $m_i$ particular industries.



IV. Averaging



Definition 

The process of averaging yields a number that is of the same order of
magnitude and is expressed in the same unit as the numbers for a common
property of the ensemble elements. The element numbers are the
ones originally assigned or obtained by a subsequent adjustment, but
they do not yet reflect weighting. The effect of any weighting that is
introduced to make the element numbers most meaningfully additive for
aggregation has to be reversed, in a sense, in the course of averaging.

An average (of positive numbers) is smaller than an aggregate; yet it, 
too, characterizes an ensemble. Thus, its derivation takes account of 
every item in a group, although the original element numbers are not 
retrievable. Furthermore, it is sensitive to the choice of weights and to
the structure of the combining formula. Most important is the representa- 
tiveness of an average from a mathematical standpoint: its substitutability 
for the number corresponding to each element. 

The last point is usually reflected in definitions of averaging and aver- 
ages. Cognizance is taken of it when averaging is defined as the process 
of deriving, from an aggregate measure or from the measures assigned to
the elements according to a selected common attribute, a single number
that is representative of the elements and is mathematically substitutable 
for each. More simply, averaging is the derivation of a representative 
number that leaves an aggregate unchanged when it is used in lieu of 
the measures assigned to all the elements with respect to a selected com- 
mon property. Other criteria have also been employed in the definition of 
averages, but the notion of substitutability serves our need adequately. 

13 On this paragraph, see A Chapter in Population Sampling, by the Sampling
Staff of the U.S. Bureau of the Census (Washington: 1947), pp. 16-20.




Formulas

The criterion of mathematical substitution may be stated formally: $X$
is an average of $x_1, \dots, x_n$ relative to a function, $\Phi$, of these $n$
measures if $\Phi(x_1, \dots, x_n) = \Phi(X, \dots, X)$.14 Thus, $X = A$, the
arithmetic mean, when $x_1 + \cdots + x_n = A + \cdots + A$; for then,
$\sum x = nA$, so that $A = \sum x/n$. The geometric mean corresponds to
$X = G$ when $x_1 \cdots x_n = G \cdots G$; for then, $\prod x = G^n$, so that
$G = \sqrt[n]{\prod x}$. For the harmonic mean, we have

$$\frac{1}{x_1} + \cdots + \frac{1}{x_n} = \frac{1}{H} + \cdots + \frac{1}{H},$$

or $\sum (1/x) = n/H$, so that $H = n/\sum(1/x) = n/\sum x^{-1}$.
(The exponent $-1$ signifies a reciprocal.)

The second definition of averaging, the simpler statement that emphasizes
the invariance of an aggregate to substitution, obviously yields
equivalent results. Thus, the replacement of every $x_i$ by $A = \sum x/n$ in the
aggregate $\sum x$ produces no change. If we start with the aggregate $\sum \log x$
and substitute $\log G = (1/n)\sum \log x$ for every $\log x_i$, we return to
$\sum \log x$. Similarly, $\sum x^{-1}$ is invariant to the substitution of $H^{-1}$
for every $x_i^{-1}$.
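The invariance of each aggregate under substitution is easy to confirm numerically. The following sketch uses an arbitrary illustrative array; it replaces every element by the relevant mean and recomputes the sum of the numbers, the sum of their logarithms, and the sum of their reciprocals.

```python
import math

x = [1.0, 2.0, 4.0, 8.0]   # arbitrary positive numbers, for illustration only
n = len(x)

A = sum(x) / n                                   # arithmetic mean
G = math.exp(sum(math.log(v) for v in x) / n)    # geometric mean
H = n / sum(1 / v for v in x)                    # harmonic mean

# Replace every element by the mean and recompute each aggregate.
sum_after_A = sum(A for _ in x)
logsum_after_G = sum(math.log(G) for _ in x)
recipsum_after_H = sum(1 / H for _ in x)

original_sum = sum(x)
original_logsum = sum(math.log(v) for v in x)
original_recipsum = sum(1 / v for v in x)
```

Each mean is tied to its own aggregate; substituting $A$ into the sum of reciprocals, for example, would not leave that aggregate unchanged.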

A general formula is available that includes these three classical means
(and others too, such as the root mean square) and that meets the
substitution criterion. This formula is:

$$X_t = \left(\frac{\sum x^t}{n}\right)^{1/t},$$

where $t$ may take any value. When $t = 1$, this expression reduces to $A$;
when $t$ approaches zero, the limiting value of $X_t$ is $G$; when $t = -1$, the
result is $H$. (The root mean square corresponds to $t = 2$.) If every $x_i$ is
replaced by $A = \sum x/n$, the generalized expression yields $X_t = A$. The
expression yields $X_t = G$ and $X_t = H$ when $G$ and $H$, respectively, are
substituted for every $x_i$.15
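A short function makes the role of $t$ concrete. The sketch below is one possible coding of the unweighted generalized mean; the name `power_mean` is hypothetical, and the case $t = 0$ is handled by the geometric mean, which is the limit as $t$ approaches zero.

```python
import math

def power_mean(x, t):
    """Unweighted generalized mean; t = 0 is taken as the geometric limit."""
    n = len(x)
    if t == 0:
        return math.exp(sum(math.log(v) for v in x) / n)
    return (sum(v ** t for v in x) / n) ** (1 / t)

x = [1.0, 2.0, 4.0, 8.0]   # arbitrary positive numbers, for illustration only

arithmetic = power_mean(x, 1)
harmonic = power_mean(x, -1)
geometric = power_mean(x, 0)
root_mean_square = power_mean(x, 2)
near_zero = power_mean(x, 1e-6)   # a very small t should land close to G
```

Comparing `near_zero` with `geometric` gives a numerical feel for the limiting statement about $t$ approaching zero.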

14 See E. L. Dodd, Lectures on Probability and Statistics (Austin: University
of Texas Press, 1945), pp. 20, 29, and 40.

15 The generalized formula is cited by E. V. Huntington, “Mathematical
Memoranda,” in H. L. Rietz, ed., Handbook of Mathematical Statistics (Boston:
Houghton Mifflin, 1924), p. 6. It is also shown by Milton Abramowitz and
Irene A. Stegun, eds., Handbook of Mathematical Functions, Applied
Mathematics Series, No. 55 (Washington: National Bureau of Standards, 1964),
p. 10; and G. H. Hardy, J. E. Littlewood, and G. Polya, Inequalities
(Cambridge: Cambridge University Press, 1934), pp. 12-13.

A famous inequality for aggregates, due to Hölder, may be modified
slightly to refer explicitly to generalized unweighted means. For two sets
of variates, $x_i$ and $y_i$, the relation

$$\frac{\sum xy}{n} \le \left(\frac{\sum x^t}{n}\right)^{1/t} \left(\frac{\sum y^{t'}}{n}\right)^{1/t'}$$

holds when $t > 1$, $t' > 1$, and these exponents satisfy the conjugacy
condition, $1/t + 1/t' = 1$. The Cauchy-Schwarz inequality is obtained when
$t = t' = 2$. When $t = t' = 1$, the expression becomes

$$\frac{\sum xy}{n} \gtreqless \frac{\sum x}{n} \cdot \frac{\sum y}{n},$$

and the left member exceeds, is less than, or equals the right member
according as the correlation between $x_i$ and $y_i$ is positive, negative, or
zero. In particular, when the $x_i$ and the $y_i$ are unequal and similarly
ordered (i.e., correspond exactly in rank), the left member must exceed
the product of the two unweighted arithmetic means on the right
(Chebyshev’s inequality). Reference will be made again to Hölder’s
inequality.16
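These relations may be tried on a small example. The sketch below assumes positive variates and the mean-based form of Hölder's inequality, with the mean of the products on the left and a product of generalized means on the right; the two arrays are invented and deliberately ordered alike, so that the Chebyshev case applies.

```python
def mean_power(values, t):
    """(sum of v**t / n) ** (1/t), the unweighted generalized mean."""
    return (sum(v ** t for v in values) / len(values)) ** (1 / t)

x = [1.0, 2.0, 4.0, 8.0]
y = [3.0, 5.0, 6.0, 9.0]   # invented values, ranked in the same order as x
n = len(x)

mean_xy = sum(a * b for a, b in zip(x, y)) / n

# Holder bound with conjugate exponents t = 3 and t' = 1.5 (1/3 + 1/1.5 = 1).
holder_bound = mean_power(x, 3) * mean_power(y, 1.5)

# Cauchy-Schwarz is the special case t = t' = 2.
cauchy_bound = mean_power(x, 2) * mean_power(y, 2)

# Chebyshev: for similarly ordered arrays, mean_xy exceeds this product.
product_of_means = (sum(x) / n) * (sum(y) / n)
```

Reversing the order of `y` would reverse the Chebyshev comparison while leaving the Hölder and Cauchy-Schwarz bounds intact.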

Several remarks on the familiar unweighted averages are in order:

1. All are internal means; that is, each average lies between the least
and the greatest of the $x_i$. Externality is a familiar hazard in work with
index numbers, as a later paper will show. Furthermore, as will be noted
below, the possibility of externality is sometimes built into mathematical
“production functions,” which usually attempt to relate output to the
contributions of manpower and capital. Finally, when aggregates and
averages are improperly matched in multiplicative identities, an attempt
to adjust the averages (to assure identity) could also lead to externality.

2. It still does not seem to be generally known that an unweighted
arithmetic mean may be written as a specially weighted harmonic mean;
and, conversely, that a harmonic mean may be viewed as a weighted
arithmetic mean. This fact is not only of pedagogic interest but also has
analytical value.

3. Since ancient times, it has been known that, when the $x_i$ are unequal
(and positive), the arithmetic mean exceeds the geometric mean, which
in turn exceeds the harmonic mean. Instances in which this proposition
is applicable, however, are not always recognized.

4. When the $x_i$ are equal, all the unweighted means are equal to $x_i$.
This is an extreme case of the substitution criterion.
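The second remark can be made concrete. One choice of weights that works, offered here as an illustrative assumption rather than as the only possibility, is to weight the harmonic formula by the $x_i$ themselves and the arithmetic formula by the reciprocals $1/x_i$. The helper names below are hypothetical.

```python
x = [1.0, 2.0, 4.0, 8.0]   # arbitrary positive numbers, for illustration only
n = len(x)

def weighted_arithmetic(values, weights):
    """sum(w * x) / sum(w)."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)

def weighted_harmonic(values, weights):
    """sum(w) / sum(w / x)."""
    return sum(weights) / sum(w / v for v, w in zip(values, weights))

A = sum(x) / n                   # unweighted arithmetic mean
H = n / sum(1 / v for v in x)    # unweighted harmonic mean

# Harmonic formula with weights x_i collapses to sum(x) / n.
A_as_harmonic = weighted_harmonic(x, weights=x)

# Arithmetic formula with weights 1/x_i collapses to n / sum(1/x).
H_as_arithmetic = weighted_arithmetic(x, weights=[1 / v for v in x])
```

The same telescoping device reappears whenever an average of ratios is weighted by the numerators or denominators of those ratios.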

16 On Hölder’s and Chebyshev’s inequalities, see, for example, Hardy,
Littlewood, and Polya (preceding footnote), pp. 24-26, 43-44. The
generalizability of Hölder’s expression to three or more variables is noted by
C. R. Rao, Linear Statistical Inference and Its Applications (New York:
Wiley, 1965), p. 44.

Before turning to weighted averages, we note two other measures of
“central tendency” or “location” that are treated in the first chapters of
an elementary statistics text: the mode and the median. The mode of a
frequency distribution or “histogram” is the value of the variate
corresponding to maximum frequency. This measure is most meaningful
when only one maximum clearly exists, a case often encountered in
manpower statistics. It is not meaningful when distributions exhibit no
bunching at all or when two or more major concentrations of frequency
are evident. Some authors would distinguish the “absolute mode” from
“relative modes” when a multimodal distribution literally has one peak.17
The median divides the total frequency of a distribution into two equal
parts. When the number of values is even rather than odd, the determination
of the median involves some arbitrariness. A mathematical
generalization is available that embraces the median, mode, and
unweighted arithmetic mean.18

The notion of substitutability applies to weighted, as well as unweighted,
averages; and the inequalities that hold for unweighted means also hold
for positively weighted ones. The invariance of the weighted aggregate
$\sum wx$ to a replacement of every $x_i$ by $\sum wx/\sum w$, the weighted
arithmetic mean, is immediately evident. Similarly, the weighted geometric
mean, $(\prod x^w)^{1/\sum w}$, may be introduced in lieu of the $x_i$ into
$\prod x^w$ without any mathematical effect. This replacement is equivalent to
the use of $\sum w \log x/\sum w$ for every $\log x_i$ in the aggregate
$\sum w \log x$. The weighted harmonic average, $\sum w/\sum(w/x)$, likewise
satisfies the substitution criterion.

Even as one formula may be written to embrace the familiar unweighted
averages, a generalization exists that subsumes the common
weighted varieties and many others:

$$X_t = \left(\frac{\sum w x^t}{\sum w}\right)^{1/t}.$$

Again, the specialization of $t$ is the key to the different means.19 This
formula is of interest in the study of production functions.

17 Bernard Ostle, Statistics in Research (2nd ed.; Ames: Iowa State University
Press, 1963), pp. 58-59.

18 Huntington (footnote 15), pp. 6-7, cites Dunham Jackson’s elegant
expression for the median and a generalization due to Jackson and R. M. Foster.

19 Hardy, Littlewood, and Polya, op. cit., pp. 13-18. This work cites other
averages, such as Muirhead’s (pp. 44-45), and refers to the important notion
of convexity, which illuminates the study of inequalities among averages.

A short accompanying table illustrates the variation in mean values
achievable for a simple array with different formulas and weights. The
numbers being combined are 1, 2, 4, and 8. In one case, no weights (i.e.,
equal weights) are used. In the second case, the lowest number in the
array receives double weight. In the final case, the highest number is
doubly weighted. The order of the several means remains unchanged
from case to case. Although the table may not reflect the variability to
which manpower calculations are actually subject, it does support our
view that users, as well as makers, of statistics should give due attention
to the choice of formulas and weights and to the analysis of inter-mean
differences.



Alternative Means (a)

Mean                  Unweighted   Weighted: I (b)   Weighted: II (c)

Harmonic, t = -1         2.13           1.74              2.50
Geometric, t -> 0        2.83           2.30              3.48
Arithmetic, t = 1        3.75           3.20              4.60
Generalized, t = 2       4.61           4.15              5.46
             t = 3       5.27           4.89              6.03

(a) The numbers being averaged are 1, 2, 4, and 8.
(b) The weights assigned to the four numbers are 2, 1, 1, and 1, respectively.
(c) The weights are 1, 1, 1, and 2.

The progression exhibited by the means for increasing values of $t$ in
each of the columns is not an accident of data selection. Higher values
of $t$ do, indeed, correspond to higher weighted and unweighted means
(for positive $x_i$ and $w_i$). Statisticians who recognize this theorem may
state it in terms of expected values or moments.20
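The table can be recomputed from the weighted generalized mean, taken here in the form $(\sum w x^t/\sum w)^{1/t}$ with the geometric mean standing in for $t = 0$. The sketch below is one possible coding; the function and variable names are hypothetical, and the numbers and weights are those stated in the table notes.

```python
import math

def weighted_power_mean(x, w, t):
    """(sum(w * x**t) / sum(w)) ** (1/t); t = 0 gives the weighted geometric mean."""
    total_w = sum(w)
    if t == 0:
        return math.exp(sum(wi * math.log(xi) for xi, wi in zip(x, w)) / total_w)
    return (sum(wi * xi ** t for xi, wi in zip(x, w)) / total_w) ** (1 / t)

x = [1, 2, 4, 8]
weight_sets = {
    "Unweighted": [1, 1, 1, 1],
    "Weighted: I": [2, 1, 1, 1],
    "Weighted: II": [1, 1, 1, 2],
}
orders = [-1, 0, 1, 2, 3]   # harmonic, geometric, arithmetic, t = 2, t = 3

# One list of rounded means per column of the table, in order of increasing t.
table = {
    name: [round(weighted_power_mean(x, w, t), 2) for t in orders]
    for name, w in weight_sets.items()
}

for name, column in table.items():
    print(f"{name:<13}", "  ".join(f"{v:.2f}" for v in column))
```

Within each column the values rise with $t$, in line with the theorem just cited; shifting weight toward the smallest or the largest number moves the whole column down or up.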

Dimensional Propriety 


In manpower analysis, weighted arithmetic averages are encountered
all the time, and weighted harmonic means are often used without being
identified as such. If firms or industries have different hours of work, a
“logical” average of these hours is of the arithmetic variety and
incorporates employment weights. The resulting expression, $\sum nh/\sum n$, is
dimensionally very acceptable; the numerator is expressed in man-hours,
a conventionally additive unit, and the denominator is expressed in
employment, another conventionally additive unit. Furthermore, verbal
algebra, which features the cancellation of words, makes it clear that the
formula provides a measure of hours of work. Note that $\sum nh/\sum n$ may also
be written as $\sum nh/\sum(nh/h)$; it is a “telescoped” version of a harmonic
mean of hours of work with man-hour weights. Hence, the harmonic
mean is also “logical” for combining hourly figures for different firms
or industries, but it has to incorporate suitable weights.
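A small numerical check of the “telescoping” may be helpful. The employment and hours figures below are invented; the sketch computes average hours once as an arithmetic mean with employment weights and once as a harmonic mean with man-hour weights.

```python
# Hypothetical firms: employment n and hours of work per worker h (invented).
n = [10, 20, 5]
h = [40.0, 35.0, 44.0]
m = [ni * hi for ni, hi in zip(n, h)]   # man-hours per firm

# Arithmetic mean of hours with employment weights: sum(n*h) / sum(n).
arithmetic_employment_weights = sum(m) / sum(n)

# Harmonic mean of hours with man-hour weights: sum(m) / sum(m/h).
harmonic_man_hour_weights = sum(m) / sum(mi / hi for mi, hi in zip(m, h))

# For contrast, the unweighted arithmetic mean of hours ignores firm size.
unweighted_mean_hours = sum(h) / len(h)
```

The first two figures coincide because each term $m/h$ in the harmonic denominator is simply the firm's employment.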

20 Ibid., pp. 26-27; and Michel Loève, Probability Theory (3rd ed.;
Princeton: D. Van Nostrand, 1963), p. 156.

Is it always easy to tell if an average is “logical” for combining the
measures of elements with respect to a certain attribute? Yes, two tests
are applicable, even though we cannot always implement our preferences.
First, unless a context prescribes otherwise, both the numerator and
denominator ought to be expressed in additive units. The joint measurement
of several variables within the context of a verbal identity may
sometimes oblige acceptance of some curious aggregates, but eccentrically
weighted means should not be sought for their own sake. Second,
the ratio must, on performance of the indicated verbal algebra, disclose
the property selected for averaging. Suppose, for example, that payroll
weights are used in an arithmetic mean of hours of work. The numerator
becomes a dimensional mess ($\sum nh^2 e$) that no reasonable verbal identity
would require; it is not expressed in a meaningful unit although the
denominator is. This awkwardness should be enough to rule out the
average even though, according to the verbal algebra, it is a composite
indicator of hours of work, and even though the operations of arithmetic
are also correctly performed.

Do these remarks suggest that payrolls have no place in the measurement
of average hours of work? Not at all, but the approach has to be
cautious. Let us revert to the first of the payroll identities shown in the
discussion of aggregation formulas, namely:

Payroll = Workers X Hours per Worker X Hourly Earnings. 

Within this framework, all three characteristics may be measured
compatibly for the same ensemble. Multiple measures may be devised for
each characteristic, and at least one set makes good dimensional sense
for all three.

Is there any manpower characteristic for which payrolls constitute a
most “natural” weight? Of course, for hourly earnings, but a harmonic
formula has to be used. Letting $\pi$ represent the payroll of the $i$th company
or industry of an ensemble, we write $\sum \pi/\sum(\pi/e)$. This harmonic
expression is transformable into a weighted arithmetic mean of hourly
earnings with man-hour weights, which association is hinted in the second
of the six identities presented earlier for payrolls. The equivalence is
clear if we first rewrite the harmonic average as $\sum me/\sum(me/e)$ and then
simplify to obtain $\sum me/\sum m$. Clearly, the numerator has the dimension of
payrolls, and the denominator refers to man-hours. Each of these units is
conventionally additive. Verbal algebra verifies that the quotient represents
average hourly earnings.
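The same check can be run for hourly earnings. The man-hour and earnings figures below are invented; the sketch compares the harmonic mean with payroll weights against the arithmetic mean with man-hour weights and, for contrast, shows that an arithmetic mean with payroll weights gives a different figure.

```python
# Hypothetical companies: man-hours m and hourly earnings e (invented).
m = [400.0, 700.0, 220.0]
e = [2.5, 3.0, 2.0]
payroll = [mi * ei for mi, ei in zip(m, e)]

# Harmonic mean of earnings with payroll weights: sum(payroll) / sum(payroll/e).
harmonic_payroll_weights = sum(payroll) / sum(pi / ei for pi, ei in zip(payroll, e))

# Arithmetic mean of earnings with man-hour weights: sum(m*e) / sum(m).
arithmetic_man_hour_weights = sum(mi * ei for mi, ei in zip(m, e)) / sum(m)

# For contrast: payroll weights in an arithmetic formula overstate the average.
arithmetic_payroll_weights = sum(pi * ei for pi, ei in zip(payroll, e)) / sum(payroll)
```

The contrast figure is higher because payroll weights already contain earnings, so the better-paid companies are counted twice over in the arithmetic formula.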

Do weights and the numbers being averaged have to be perfect dimensional
“mates”? Preferably, yes; and, when approximations to the
logically desirable weights have to be used, the choice still ought to make
tolerable dimensional sense. For example, employment weights might
plausibly substitute for man-hour weights in an arithmetic mean of hourly
earnings; but it would be foolish to weight by, say, man-hour productivity
instead, or by its reciprocal, unit labor requirements. Awareness that a
relevant accounting identity should be satisfied provides a guide to (a)
good literal algebra (which is concerned with the content and structure of
measures) and to (b) good dimensional sense while it also assures (c)
satisfaction of the less stringent demands of verbal algebra.



